
    Straight to Shapes: Real-time Detection of Encoded Shapes

    Current object detection approaches predict bounding boxes, but these provide little instance-specific information beyond location, scale and aspect ratio. In this work, we propose to directly regress to objects' shapes in addition to their bounding boxes and categories. It is crucial to find an appropriate shape representation that is compact and decodable, and in which objects can be compared for higher-order concepts such as view similarity, pose variation and occlusion. To achieve this, we use a denoising convolutional auto-encoder to establish an embedding space, and place the decoder after a fast end-to-end network trained to regress directly to the encoded shape vectors. This yields what to the best of our knowledge is the first real-time shape prediction network, running at ~35 FPS on a high-end desktop. With higher-order shape reasoning well-integrated into the network pipeline, the network shows the useful practical quality of generalising to unseen categories similar to the ones in the training set, something that most existing approaches fail to handle.
    Comment: 16 pages including appendix; Published at CVPR 201

    Deep Learning for Detecting Multiple Space-Time Action Tubes in Videos

    In this work, we propose an approach to the spatiotemporal localisation (detection) and classification of multiple concurrent actions within temporally untrimmed videos. Our framework is composed of three stages. In stage 1, appearance and motion detection networks are employed to localise and score actions from colour images and optical flow. In stage 2, the appearance network detections are boosted by combining them with the motion detection scores, in proportion to their respective spatial overlap. In stage 3, sequences of detection boxes most likely to be associated with a single action instance, called action tubes, are constructed by solving two energy maximisation problems via dynamic programming. In the first pass, action paths spanning the whole video are built by linking detection boxes over time using their class-specific scores and their spatial overlap; in the second pass, temporal trimming is performed by ensuring label consistency for all constituting detection boxes. We demonstrate the performance of our algorithm on the challenging UCF-101, J-HMDB-21 and LIRIS-HARL datasets, achieving new state-of-the-art results across the board and significantly increasing detection speed at test time. We report gains of 20% and 11% in mAP (mean average precision) on the UCF-101 and J-HMDB-21 datasets, respectively, compared to the state of the art.
    Comment: Accepted by British Machine Vision Conference 201
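The first-pass linking step described in this abstract (choosing one detection box per frame so that the summed class scores and frame-to-frame overlaps are maximal) can be illustrated with a Viterbi-style dynamic programme. The following is a minimal sketch, not the authors' implementation: the function names, the `overlap_weight` parameter and the exact form of the energy are assumptions made for illustration, and every frame is assumed to contain at least one box.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def link_action_path(
    boxes: Sequence[Sequence[Box]],
    scores: Sequence[Sequence[float]],
    overlap_weight: float = 1.0,  # hypothetical trade-off parameter
) -> List[int]:
    """Return, for each frame, the index of the box on the best path.

    Assumed energy: sum of per-box class scores plus overlap_weight
    times the IoU between boxes chosen in consecutive frames.
    """
    n_frames = len(boxes)
    if n_frames == 0:
        return []
    best = [list(scores[0])]      # best[t][j]: best energy ending at box j
    back: List[List[int]] = []    # back[t-1][j]: predecessor of box j at t
    for t in range(1, n_frames):
        cur_best, cur_back = [], []
        for j, box in enumerate(boxes[t]):
            cands = [
                best[t - 1][i] + overlap_weight * iou(prev, box)
                for i, prev in enumerate(boxes[t - 1])
            ]
            i_star = max(range(len(cands)), key=cands.__getitem__)
            cur_best.append(cands[i_star] + scores[t][j])
            cur_back.append(i_star)
        best.append(cur_best)
        back.append(cur_back)
    # Backtrack from the highest-energy box in the last frame.
    j = max(range(len(best[-1])), key=best[-1].__getitem__)
    path = [j]
    for t in range(n_frames - 1, 0, -1):
        j = back[t - 1][j]
        path.append(j)
    path.reverse()
    return path
```

With `overlap_weight=0` the path simply follows the highest-scoring box in each frame; a positive weight favours spatially consistent paths even when an individual frame scores another box slightly higher.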

    Optical pulse propagation in a switched-on photonic lattice: Rabi effect with the roles of light and matter interchanged

    A light pulse propagating in a suddenly switched on photonic lattice, when the central frequency lies in the photonic band gap, is an analog of the Rabi model where the two-level system is the two resonant (i.e. Bragg-coupled) Fourier modes of the pulse, while the photonic lattice serves as a monochromatic external field. A simple theory of these Rabi oscillations is given and confirmed by the numerical solution of the corresponding Maxwell equations. This is a direct, i.e. temporal, analog of the Rabi effect, in addition to the spatial analog in optical beam propagation described in Opt. Lett. 32, 1920 (2007). An additional high-frequency modulation of the Rabi oscillations reflects the lattice-induced energy transfer between the electric and magnetic fields of the pulse.
    Comment: 3 pages, 5 figures

    InfiniTAM v3: A Framework for Large-Scale 3D Reconstruction with Loop Closure

    Volumetric models have become a popular representation for 3D scenes in recent years. One breakthrough leading to their popularity was KinectFusion, which focuses on 3D reconstruction using RGB-D sensors. However, monocular SLAM has since also been tackled with very similar approaches. Representing the reconstruction volumetrically as a TSDF leads to most of the simplicity and efficiency that can be achieved with GPU implementations of these systems. However, this representation is memory-intensive and limits applicability to small-scale reconstructions. Several avenues have been explored to overcome this. With the aim of summarizing them and providing for a fast, flexible 3D reconstruction pipeline, we propose a new, unifying framework called InfiniTAM. The idea is that steps like camera tracking, scene representation and integration of new data can easily be replaced and adapted to the user's needs. This report describes the technical implementation details of InfiniTAM v3, the third version of our InfiniTAM system. We have added various new features, as well as making numerous enhancements to the low-level code that significantly improve our camera tracking performance. The new features that we expect to be of most interest are (i) a robust camera tracking module; (ii) an implementation of Glocker et al.'s keyframe-based random ferns camera relocaliser; (iii) a novel approach to globally-consistent TSDF-based reconstruction, based on dividing the scene into rigid submaps and optimising the relative poses between them; and (iv) an implementation of Keller et al.'s surfel-based reconstruction approach.
    Comment: This article largely supersedes arxiv:1410.0925 (it describes version 3 of the InfiniTAM framework)

    OCVD Measurement of Ambipolar and Minority Carrier Lifetime in 4H-SiC Devices: Relevance of the Measurement Setup

    The open-circuit voltage decay (OCVD) method is a well-known technique for conducting electrical measurements of carrier lifetime: its main advantages are the simple setup and the possibility of measuring commercial devices without removing the package, as optical methods require. Although several researchers have reported carrier lifetimes measured by the OCVD method in different devices, there has been little discussion of the potential effect of the experimental setup on the results obtained. By comparing the outputs of experimental measurements with those of numerical simulations, this study investigates the overlooked effect of the OCVD measurement setup on the measured lifetimes. Due to the growing importance of SiC-based devices, the analysis is applied to a 4H-SiC p-i-n diode. Two main points are addressed: 1) the effect of the circuit setup on the ambipolar lifetime is discussed, and a method originally developed for improving the estimate of low-level carrier lifetime in OCVD measurements is used to correct the measured lifetime for this influence; 2) the origin of the local minimum that may appear in the lifetime versus time curves is also investigated. It is found that the minimum can also be related to the time constant of the experimental setup, which casts doubt on the usual interpretation of this minimum as the minority carrier lifetime. A method is thus proposed to help discriminate between the two interpretations.
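For orientation, the lifetime versus time curves mentioned in this abstract are conventionally obtained from the slope of the voltage decay using the textbook OCVD relation tau = -F (kT/q) / (dV/dt), with F = 2 under high-level injection (ambipolar lifetime) and F = 1 under low-level injection (minority carrier lifetime). The sketch below applies that standard relation to sampled data; it is not the correction method proposed in the paper, and the function name and parameters are illustrative assumptions.

```python
from typing import List, Sequence

BOLTZMANN_EV_PER_K = 8.617333262e-5  # k/q in V/K, so k*T/q comes out in volts


def ocvd_lifetime(
    times_s: Sequence[float],
    voltages_v: Sequence[float],
    temperature_k: float = 300.0,
    injection_factor: float = 2.0,  # 2: high-level injection, 1: low-level
) -> List[float]:
    """Lifetime at each interior sample from the textbook OCVD relation.

    Uses a central finite difference for dV/dt. Samples where the voltage
    is not decaying are reported as infinity.
    """
    thermal_voltage = BOLTZMANN_EV_PER_K * temperature_k
    lifetimes = []
    for i in range(1, len(times_s) - 1):
        slope = (voltages_v[i + 1] - voltages_v[i - 1]) / (
            times_s[i + 1] - times_s[i - 1]
        )
        if slope < 0:
            lifetimes.append(-injection_factor * thermal_voltage / slope)
        else:
            lifetimes.append(float("inf"))
    return lifetimes
```

A local minimum in the returned curve is the feature whose interpretation the abstract questions, since the time constant of the measurement circuit can also shape the decay.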

    Objective-free excitation of quantum emitters with a laser-written micro parabolic mirror

    The efficient excitation of quantum sources such as quantum dots or single molecules requires high-NA optics, which are often a challenge in cryogenics or in ultrafast optics. Here we propose a 3.2 µm wide parabolic mirror, with a 0.8 µm focal length, fabricated by direct laser writing on CdSe/CdS colloidal quantum dots, capable of focusing the excitation light to a sub-wavelength spot and of extracting the generated emission by collimating it into a narrow beam. This mirror is fabricated via in-situ volumetric optical lithography, which can be aligned to individual emitters, and it can be easily adapted to other geometries beyond the paraboloid. This compact solid-state transducer from the far field to the emitter has important applications in objective-free quantum technologies.
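The two dimensions quoted in this abstract can be related with elementary geometry. The sketch below assumes an ideal paraboloid z = r^2 / (4f) and assumes that the quoted 3.2 µm width is the aperture diameter; neither assumption is confirmed by the abstract, and the function names are illustrative.

```python
import math


def paraboloid_depth_um(aperture_diameter_um: float, focal_length_um: float) -> float:
    """Height of the rim above the vertex for z = r^2 / (4 f)."""
    radius = aperture_diameter_um / 2.0
    return radius ** 2 / (4.0 * focal_length_um)


def rim_angle_deg(aperture_diameter_um: float, focal_length_um: float) -> float:
    """Angle at the focus between the optical axis (pointing out of the
    mirror opening) and the direction from the focus to the rim."""
    radius = aperture_diameter_um / 2.0
    rim_height = paraboloid_depth_um(aperture_diameter_um, focal_length_um)
    return math.degrees(math.atan2(radius, rim_height - focal_length_um))


depth = paraboloid_depth_um(3.2, 0.8)   # 1.6^2 / 3.2, approximately 0.8 µm
angle = rim_angle_deg(3.2, 0.8)         # approximately 90 degrees
```

Under these assumptions the rim height equals the focal length, so the focus lies in the plane of the aperture and the mirror surface spans roughly a full hemisphere as seen from an emitter placed at the focus.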